Communications system

ABSTRACT

A communications system comprising a plurality of islands, a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; in which the system also comprises means for detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; in which the system also comprises means for providing from a further control means to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.

[0001] The present invention is directed to the field of communications in general and to a method of recovering from a failure in a communications system in particular.

[0002] Multimedia communications services over Packet Based Networks (PBN) which may not provide a guaranteed Quality of Service are described in ITU-T Recommendation H.323 (February 1998). The packet based network over which H.323 entities communicate may be point-to-point connections, a single network segment, or an internetwork having multiple segments and possibly complex topologies. ITU-T Recommendation E.164 (May 1997) describes the international public telecommunication numbering plan.

[0003] A typical IP telephony system comprises a plurality of hosts interconnected via a backbone network composed of a number of routers to which the plurality of hosts are connected. These hosts are grouped in network “islands” which have high bandwidth available between all hosts in an island. These islands of high bandwidth are interconnected by the “backbone” network comprising a number of links of known but limited bandwidth between pairs of islands. The bandwidth available on a link between two islands will not generally be sufficient to carry all the telephony traffic between those islands which the hosts in the islands could, in theory, generate. Associated with each inter-island link there is therefore a pair of hosts, one at each end of the inter-island link, which perform an Admission Control Function (ACF). When a host in an island wants to use bandwidth on an inter-island link, it must first be granted permission by the local host performing the ACF for that inter-island link. The ACF ensures that the link bandwidth is never over-committed. If the bandwidth would be over-committed by granting permission for more bandwidth use, then permission is denied. In H.323 based networks, the admission control function is contained within the H.323 gatekeeper. The guaranteed Quality of Service required for correct transmission of telephony traffic through restricted-bandwidth inter-island links thus depends on correct operation of the ACF. To correctly control traffic on the backbone network links, the ACFs at both ends of an inter-island link must hold the same information about link usage. In normal operation, this is achieved by synchronising the two ACFs by way of inter-host signalling.
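By way of illustration only, the admission rule described above may be sketched as follows. The class name `AdmissionControl`, its method names and the figures of 256 and 64 kbit/s are assumptions made for the example and do not appear in H.323 or in any particular gatekeeper.

```python
class AdmissionControl:
    """Hypothetical ACF for one end of an inter-island link."""

    def __init__(self, link_capacity_kbps):
        self.capacity = link_capacity_kbps
        self.committed = 0  # bandwidth already granted to calls

    def request(self, kbps):
        """Grant permission only if the link would not be over-committed."""
        if self.committed + kbps > self.capacity:
            return False    # permission denied
        self.committed += kbps
        return True

    def release(self, kbps):
        """Return bandwidth to the link when a call clears."""
        self.committed = max(0, self.committed - kbps)


# Assumed figures: a 256 kbit/s link and calls of 64 kbit/s each.
acf = AdmissionControl(link_capacity_kbps=256)
granted = [acf.request(64) for _ in range(5)]
```

With these assumed figures the first four requests fill the link and the fifth is denied, so the committed bandwidth never exceeds the link capacity.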

[0004] If the ACF in a host at one end of a link fails, the media (traffic) carried by that link will continue to flow (i.e. existing calls will continue). The island would normally be provided with a spare ACF that can be brought into service to replace the failed one. This new ACF will assume the role of the failed one, taking over control of its inter-island links. The replacement ACF will have no knowledge of the link resource allocations in effect immediately before the failure of its predecessor. Although it will gradually obtain knowledge of the true allocation state through updates from inter-island signalling generated when existing allocations are released, it is not in a position on coming into service to immediately authorise new allocation requests. The time to recover is directly related to the duration of resource allocations handled by the ACF. In an IP telephony network the allocations of inter-island resources are associated with calls. The ACF will not fully regain control of the inter-island link until all calls in progress at the time of failure have cleared. Due to the potentially long call hold time, the ACF will not be fully in control for an unacceptably long time following failure.

[0005] The present invention provides a communications system comprising a plurality of islands, a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; in which the system also comprises means for detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; in which the system also comprises means for providing to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.

[0006] The present invention further provides a method of managing communications in a communications system comprising a plurality of islands and a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; the method comprising the steps of detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; providing to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.

[0007] Embodiments of the present invention will now be described by way of example with reference to the drawings in which:

[0008] FIG. 1 shows in diagrammatic form an IP telephony system of the prior art;

[0009] FIG. 2 shows a representation of a message sequence according to the present invention.

[0010] FIG. 1 shows an IP communications network comprising a plurality of islands A, B, interconnected via backbone network C comprising links (not shown) of limited capacity. Calls between the islands are controlled by a call control function (CCF) K, L in each island. Allocation of link resources to multiple calls between the islands is controlled by an ACF D, E within each island. Each island also contains a replacement ACF F, G that will take over in the event of failure of the original ACF. Island A also comprises one or more IP telephony terminals represented by terminal H and island B also comprises one or more IP telephony terminals represented by terminal J.

[0011] The sequence of events followed in setting up and clearing down a successful call from IP telephony terminal H to IP telephony terminal J will now be described with reference to FIG. 2, as follows:

[0012] 1. The calling terminal H contacts local call control function K, and supplies an identifier (for example an E.164 number) for the terminal it wishes to call (e.g. terminal J);

[0013] 2. CCF K determines that the called terminal J is in another island B, and that the appropriate path M for media to pass between the two islands is via backbone network C;

[0014] 3. CCF K contacts the ACF D associated with the local end of inter-island path M to request allocation of resources within path M to handle the call;

[0015] 4. ACF D confirms that the required resources are available, and records the resources (e.g. bandwidth on path M) allocated for the call. If insufficient resources are available, the call attempt is rejected;

[0016] 5. If sufficient resources are available, CCF K contacts its peer CCF L in island B to extend the call toward its destination. CCF L then contacts its local ACF E associated with the local end of the inter-island path M to request allocation of resources within path M to handle the call. Thus the ACFs at both ends of the link have to allocate the same resources on the link needed to send the call over backbone network C;

[0017] 6. ACF E confirms that the required resources are available, and records the resources allocated for the call. However, if insufficient resources are available, the call attempt is rejected by ACF E. This could occur if calls are set up simultaneously in both directions. In this case both ACFs D and E will grant a requesting call permission to leave their own island, based on the known bandwidth available, and independently make a local note that this bandwidth has now been committed. However, one or both of the calls (depending on exact timing and the bandwidth requested) could then be rejected by the ACF receiving the call at the other end if the receiving ACF has in the meantime committed bandwidth to the simultaneous outgoing call in the other direction and, as a result, does not have enough bandwidth remaining;

[0018] 7. If sufficient resources are available, CCF L contacts the destination terminal J to extend the call to its destination and complete setup of the call. Data may now flow between the terminals H, J using the resources that have just been allocated in path M.

[0019] At the end of the call, one or other of the terminals H, J informs its local call control function (K or L respectively) that it wishes to terminate the call. The local CCF in turn contacts the local ACF, D or E, to free the resources allocated in path M for that call. Termination of the call in islands A, B may be synchronised by communication between the two CCFs K, L.
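The set-up and clearing sequence described above, including the case of simultaneous calls in opposite directions, may be sketched as follows. This is a minimal illustration only: the class `Acf`, the functions `originate`, `extend` and `clear`, the allocation identifier format and the capacity of 128 units are all assumptions, and the release of the near-end allocation when the far end rejects a call is assumed behaviour that the description does not spell out.

```python
import itertools


class Acf:
    """Hypothetical ACF guarding one end of inter-island path M."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.allocations = {}            # local allocation id -> bandwidth
        self._ids = itertools.count(1)

    def free(self):
        return self.capacity - sum(self.allocations.values())

    def allocate(self, bandwidth):
        """Return a local allocation id, or None if resources are insufficient."""
        if bandwidth > self.free():
            return None
        alloc_id = f"{self.name}-{next(self._ids)}"
        self.allocations[alloc_id] = bandwidth
        return alloc_id

    def deallocate(self, alloc_id):
        # The request names the allocation only; the ACF looks up the amount.
        return self.allocations.pop(alloc_id, 0)


def originate(acf, bandwidth):
    """Steps 3 and 4: the originating CCF asks its local ACF for resources."""
    return acf.allocate(bandwidth)


def extend(near_acf, near_id, far_acf, bandwidth):
    """Steps 5 to 7: the far-end CCF asks its own ACF for the same resources."""
    far_id = far_acf.allocate(bandwidth)
    if far_id is None:
        near_acf.deallocate(near_id)     # assumed rollback, not stated in the text
        return None
    return (near_id, far_id)


def clear(call, near_acf, far_acf):
    """Call clearing: each CCF frees the allocation held by its local ACF."""
    near_id, far_id = call
    near_acf.deallocate(near_id)
    far_acf.deallocate(far_id)


d, e = Acf("D", 128), Acf("E", 128)

# A normal call from island A to island B, followed by its release.
call_1 = extend(d, originate(d, 64), e, 64)
clear(call_1, d, e)

# Simultaneous calls in both directions, each needing the whole path.
out_a = originate(d, 128)                # D commits locally
out_b = originate(e, 128)                # E commits locally at the same moment
call_a = extend(d, out_a, e, 128)        # rejected by E, which is already full
call_b = extend(e, out_b, d, 128)        # accepted by D after the rollback
```

In this ordering one of the two simultaneous calls is rejected and the other completes. With different timing, for example if the rollback arrived later, both could be rejected, as noted in step 6.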

[0020] If ACF E fails while calls are in progress, it is rapidly replaced by replacement ACF G. However, ACF G does not have a copy of the current resource allocation data that was being used by ACF E prior to failing, so when replacement ACF G is called upon to allocate resources it cannot be sure whether the required resources are currently available.

[0021] Whenever an existing call is cleared, the replacement ACF G will receive a de-allocation request informing it of any freeing of resources required. In existing protocols the de-allocation request simply refers to the original allocation request without repeating details of the allocation. This means that the replacement ACF G cannot determine how much of the resources to release as calls clear.
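The difficulty may be illustrated by the following sketch, in which a de-allocation request carries only a reference to the original allocation. The class `Acf`, the reference `"ref-1"` and the quantities used are assumptions made for the example.

```python
class Acf:
    """Hypothetical ACF that records allocations by reference."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.allocations = {}   # allocation reference -> bandwidth

    def allocate(self, ref, bandwidth):
        self.allocations[ref] = bandwidth

    def deallocate(self, ref):
        """Return the bandwidth released, or None if the reference is unknown."""
        return self.allocations.pop(ref, None)


original = Acf(256)
original.allocate("ref-1", 64)      # recorded before the failure

replacement = Acf(256)              # comes into service with no records
released = replacement.deallocate("ref-1")
```

Because the replacement holds no record for `"ref-1"`, the lookup yields nothing and the quantity of resources to release remains unknown to it.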

[0022] Since both ACFs D and E were informed of all calls traversing the inter-island path M, they both have similar resource allocation data. According to the present invention, the replacement ACF G communicates with its peer ACF D at the other island A to retrieve information on the current resource allocations against the inter-island path M.

[0023] In existing protocols individual resource allocations (e.g. as used in communications between the CCF and ACF within an island) are only meaningful within that island. Thus it is not possible to simply transfer resource allocation information from one ACF to its peer in another island as that island will be unable to correlate the transferred resource allocation information with calls handled by that island.

[0024] As illustrated in FIG. 2, when the replacement ACF G takes over from the failed ACF E, it sends a message to its peer ACF D. This message notifies ACF D that recovery is underway on resource allocations relating to all existing calls between its island and the island (in this case island B) containing the replacement ACF (i.e. G), and requests it to respond with a message to ACF G indicating the current level of unallocated resources on path M. When the unallocated resource level message is received at ACF G it can immediately start accepting new resource allocation requests and de-allocation requests from its local CCF L, and respond to these requests by making new allocations of resources from the pool of indicated unused resources.
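The exchange described above may be sketched as follows. The class and method names, the dictionary used as a message and the capacity of 256 units are assumptions made for illustration; no particular message encoding is implied. Refusing requests until the peer has replied is likewise an assumed policy for the period in which the replacement ACF cannot be sure what is available.

```python
class SurvivingAcf:
    """Peer ACF (D in the description) whose allocation records are intact."""

    def __init__(self, capacity, allocations):
        self.capacity = capacity
        self.allocations = dict(allocations)   # local allocation id -> bandwidth
        self.in_force_at_recovery = set()

    def on_recovery_request(self):
        # Remember which allocations pre-date the recovery, then report
        # the current level of unallocated resources on the path.
        self.in_force_at_recovery = set(self.allocations)
        unallocated = self.capacity - sum(self.allocations.values())
        return {"type": "unallocated_level", "unallocated": unallocated}


class ReplacementAcf:
    """Replacement ACF (G) that starts with no allocation records."""

    def __init__(self):
        self.unallocated = None                # unknown until the peer replies
        self.allocations = {}

    def recover_from(self, peer):
        reply = peer.on_recovery_request()
        self.unallocated = reply["unallocated"]

    def allocate(self, alloc_id, bandwidth):
        # Assumed policy: refuse while the resource state is unknown.
        if self.unallocated is None or bandwidth > self.unallocated:
            return False
        self.unallocated -= bandwidth
        self.allocations[alloc_id] = bandwidth
        return True


acf_d = SurvivingAcf(256, {"d-1": 64, "d-2": 64, "d-3": 64})
acf_g = ReplacementAcf()

before = acf_g.allocate("g-1", 64)      # refused: resource state unknown
acf_g.recover_from(acf_d)
after = acf_g.allocate("g-1", 64)       # granted from the 64 units reported unused
over = acf_g.allocate("g-2", 64)        # refused: the pool is exhausted
```

Only a single quantity crosses between the islands, so the replacement ACF needs no knowledge of the peer's island-local allocation identifiers.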

[0025] On clearing calls for which the original allocation data in ACF E has been lost, a de-allocation request will be sent to the ACF D still operating in island A. This ACF performs its normal resource de-allocation actions, and in addition, if the allocation data indicates that the call is one of those in force at the time of the request message from the replacement ACF G, then it will send a message to ACF G indicating the quantity of resources that are being de-allocated. ACF G increases its recorded level of unused resource accordingly.

[0026] Messages from ACF D to ACF G continue to be sent until all resources that were allocated at the time of the failure of ACF E have been released. At this point both working ACFs D and G will have a complete set of corresponding resource allocation records and the recovery process will have completed. This is not to say that all resources must be free at any one time in order to achieve recovery. On the contrary, any resources freed since receipt of the request message from the replacement ACF G may be re-allocated by recovered ACF G at any time.
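The remainder of the recovery may be sketched as follows. The names `SurvivingAcf`, `RecoveringAcf`, `pending` and `on_release_notice`, and the quantities used, are assumptions made for illustration.

```python
class SurvivingAcf:
    """Peer ACF (D) that reports releases of pre-recovery allocations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.allocations = {}    # local allocation id -> bandwidth
        self.pending = set()     # allocations in force at the recovery request

    def allocate(self, alloc_id, bandwidth):
        self.allocations[alloc_id] = bandwidth

    def on_recovery_request(self):
        self.pending = set(self.allocations)
        return self.capacity - sum(self.allocations.values())

    def deallocate(self, alloc_id, notify_peer):
        bandwidth = self.allocations.pop(alloc_id)
        if alloc_id in self.pending:
            # The recovering ACF has no record of this call: tell it how much
            # resource has just been freed.
            self.pending.remove(alloc_id)
            notify_peer(bandwidth)
        return bandwidth

    def recovery_complete(self):
        return not self.pending


class RecoveringAcf:
    """Replacement or recovered ACF (G) tracking a pool of unused resources."""

    def __init__(self, unallocated):
        self.unallocated = unallocated
        self.allocations = {}

    def allocate(self, alloc_id, bandwidth):
        if bandwidth > self.unallocated:
            return False
        self.unallocated -= bandwidth
        self.allocations[alloc_id] = bandwidth
        return True

    def deallocate(self, alloc_id):
        self.unallocated += self.allocations.pop(alloc_id)

    def on_release_notice(self, bandwidth):
        self.unallocated += bandwidth


acf_d = SurvivingAcf(256)
acf_d.allocate("d-1", 64)                          # calls in progress at failure
acf_d.allocate("d-2", 128)
acf_g = RecoveringAcf(acf_d.on_recovery_request())

acf_d.deallocate("d-1", acf_g.on_release_notice)   # old call clears: G is told
new_call = acf_g.allocate("g-1", 128)              # freed resources re-used at once
acf_d.allocate("d-3", 128)                         # same new call, recorded at D
acf_d.deallocate("d-2", acf_g.on_release_notice)   # last old call clears
done = acf_d.recovery_complete()

acf_g.deallocate("g-1")                            # new call clears normally
acf_d.deallocate("d-3", acf_g.on_release_notice)   # no notice: G has its own record
```

In this sketch recovery completes while the new call is still in progress, which reflects the point made above that the resources need not all be free at any one time.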

[0027] Although described above in terms of a replacement ACF, some systems will be able to recover a failed ACF and return it to service within an acceptable time such that replacement is not necessary. Alternatively, a replacement ACF may itself be replaced some time later by the recovered ACF. The present invention also applies to recovered ACFs where information of resource allocation may have been lost, or merely become inaccurate due to changes in resource allocation that occurred whilst the ACF was not functioning. 

1. A communications system comprising a plurality of islands, a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; in which the system also comprises means for detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; in which the system also comprises means for providing to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.
 2. The system as claimed in claim 1 in which the resources allocated to a call are released on termination of the call, in which the system also comprises information providing means for providing to the replacement or recovered control means information on released resources as they become available on the media path.
 3. The system as claimed in claim 2 in which the released resources comprise bandwidth made available by the termination of calls.
 4. The system as claimed in any one of claims 2 and 3, in which the replacement or recovered control means comprises means for requesting transfer of the information from the information providing means.
 5. The system of any above claim in which the source of the information to be provided is located in the other one of the first and second islands from the replacement or recovered control means.
 6. The system of any above claim in which the system also comprises means for co-ordination of the operation of the control means by communication between islands.
 7. The system of claim 6 in which each island is associated with a call control function (CCF); in which the communication of claim 6 between islands comprises communication between the CCFs associated with those islands.
 8. The system of claim 7 in which one of the CCFs is associated with each endpoint of each call; in which the communication comprises communication between the CCFs associated with the endpoints of each call.
 9. The system of any above claim in which the plurality of islands are interconnected via a backbone network in which the replacement or recovered control means is in a different island from the source of the information to be provided.
 10. The system of any above claim in which the media path comprises at least in part a blocking section and in which the replacement or recovered control means is separated by the blocking section from the source of the information to be provided.
 11. The system of any above claim in which the replacement or recovered control means is connected via a router to the source of the information to be provided.
 12. The system of any above claim in which the source of the information to be provided comprises a further control means.
 13. The system of any above claim as dependent from claim 6 in which the CCFs are implemented according to internet protocol.
 14. The system of any above claim for carrying voice traffic as voice over internet protocol (VoIP).
 15. A method of managing communications in a communications system comprising a plurality of islands and a media path comprising resources for carrying data in a plurality of calls between first and second ones of the islands in which each of the first and second islands comprises control means for managing allocation of the resources of the media path between the plurality of calls; the method comprising the steps of detecting a faulty control means and either replacing the faulty control means with a working replacement control means or recovering the faulty control means to working order; providing to the replacement or recovered control means on replacement or recovery information on the allocation of the resources of the media path.
 16. The method of claim 15 including the steps of releasing the resources allocated to a call on termination of the call and providing information to the replacement or recovered control means on released resources as they become available on the media path.
 17. The method of claim 16 in which the released resources comprise bandwidth made available by the termination of calls.
 18. The method of any one of claims 16 and 17 including the step of the replacement or recovered control means requesting the information.
 19. The method of any one of claims 15 to 18 in which the source of the information is located in the other one of the first and second islands from the replacement or recovered control means.
 20. The method of any one of claims 15 to 19 including the step of coordinating the operation of the control means by communication between islands.
 21. The method of claim 20 in which each island is associated with a call control function (CCF), in which the communication of claim 20 comprises communication between the CCFs associated with those islands.
 22. The method of claim 21 in which one of the CCFs is associated with each end point of each call, the method including the step of communicating between the CCFs associated with the end points of each call.
 23. The method of any of claims 15 to 22 in which the plurality of islands are interconnected by a backbone network in which the replacement or recovered control means is in a different island from the source of the information to be provided.
 24. The method of any one of claims 15 to 23 in which the media path comprises at least in part a blocking section and in which the replacement or recovered control means is separated by the blocking section from the source of the information.
 25. The method of any one of claims 15 to 24 in which the replacement or recovered control means is connected via a router to the source of the information.
 26. The method of any one of claims 15 to 25 in which the source of the information comprises a further control means.
 27. The method of any one of claims 15 to 26 as dependent from claim 21 including the step of implementing the CCFs according to internet protocol.
 28. The method of any one of claims 15 to 27 including the step of carrying voice traffic as voice-over internet protocol (VoIP). 